I want your feedback to make the book better for you and other readers. If you find typos, errors, or places where the text may be improved, please let me know. The best ways to provide feedback are by GitHub or hypothes.is annotations.
Opening an issue or submitting a pull request on GitHub: https://github.com/isaactpetersen/Fantasy-Football-Analytics-Textbook
Adding an annotation using hypothes.is.
To add an annotation, select some text and then click the
symbol on the pop-up menu.
To see the annotations of others, click the
symbol in the upper right-hand corner of the page.
11 Mixed Models
11.1 Getting Started
11.1.1 Load Packages
11.1.2 Specify Package Options
11.1.3 Load Data
We created the player_stats_weekly.RData and player_stats_seasonal.RData objects in Section 3.21.3.
11.2 Overview of Mixed Models
We will discuss a modeling framework that goes by many terms, including mixed models, mixed-effects models, multilevel models, hierarchical linear models.
11.2.1 Ecological Fallacy
11.2.2 Simpson’s Paradox
11.3 Fantasy Points Per Season by Position, Age, and Experience
Code
player_stats_seasonal_offense_subset <- player_stats_seasonal_offense %>%
filter(position_group %in% c("QB","RB","WR","TE"))
player_stats_seasonal_offense_subset$position[which(player_stats_seasonal_offense_subset$position == "HB")] <- "RB"
player_stats_seasonal_kicking_subset <- player_stats_seasonal_kicking %>%
filter(position == "K")
player_stats_seasonal_offense_subset <- bind_rows(
player_stats_seasonal_offense_subset,
player_stats_seasonal_kicking_subset
)
player_stats_seasonal_offense_subset$player_idFactor <- factor(player_stats_seasonal_offense_subset$player_id)
player_stats_seasonal_offense_subset$positionFactor <- factor(player_stats_seasonal_offense_subset$position)Code
seasons17week <- 2001:2020
seasons18week <- 2021:max(nfl_depthCharts$season, na.rm = TRUE)
endOfSeasonDepthCharts <- nfl_depthCharts %>%
filter((season %in% seasons17week & week == 18) | (season %in% seasons18week & week == 19)) # get end-of-season depth charts
qb1s <- endOfSeasonDepthCharts %>%
filter(position == "QB", depth_team == 1)
fb1s <- endOfSeasonDepthCharts %>%
filter(position == "FB", depth_team == 1)
k1s <- endOfSeasonDepthCharts %>%
filter(position == "K", depth_team == 1)
rb1s <- endOfSeasonDepthCharts %>%
filter(position == "RB", depth_team == 1)
wr1s <- endOfSeasonDepthCharts %>%
filter(position == "WR", depth_team == 1)
te1s <- endOfSeasonDepthCharts %>%
filter(position == "TE", depth_team == 1)
player_stats_seasonal_offense_subsetDepth <- player_stats_seasonal_offense_subset %>%
filter(player_id %in% c(
qb1s$gsis_id,
fb1s$gsis_id,
k1s$gsis_id,
rb1s$gsis_id,
wr1s$gsis_id,
te1s$gsis_id
))11.3.1 Scatterplots of Fantasy Points by Age and Position
Scatterplots are a helpful tool for quickly examining the association between two variables. However, scatterplots—as well as correlation and multiple regression—can hide meaningful associations that differ across units of analysis.
11.3.1.1 Quarterbacks
A scatterplot of Quarterbacks’ fantasy points by age is in Figure 11.1.
Code
plot_scatterplotFantasyPointsByAgeQB <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "QB") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_point(
aes(
text = player_display_name, # add player name for mouse over tooltip
label = season # add season for mouse over tooltip
)) +
geom_smooth(
mapping = aes(
x = age,
y = fantasy_points),
inherit.aes = FALSE
) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Quarterbacks"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_scatterplotFantasyPointsByAgeQB,
tooltip = c("age","fantasy_points","text","label"))Based on the scatterplot (and the bivariate association below), Quarterbacks’ fantasy points appear to increase with age.
Code
Pearson's product-moment correlation
data: age and fantasy_points
t = 3.2358, df = 1051, p-value = 0.001251
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.03914186 0.15877861
sample estimates:
cor
0.09931915
11.3.1.2 Fullbacks
A scatterplot of Fullbacks’ fantasy points by age is in Figure 11.2.
Code
plot_scatterplotFantasyPointsByAgeFB <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "FB") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_point(
aes(
text = player_display_name, # add player name for mouse over tooltip
label = season # add season for mouse over tooltip
)) +
geom_smooth(
mapping = aes(
x = age,
y = fantasy_points),
inherit.aes = FALSE
) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Fullbacks"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_scatterplotFantasyPointsByAgeFB,
tooltip = c("age","fantasy_points","text","label"))Based on the scatterplot, Fullbacks’ fantasy points appear to be relatively stable across ages.
11.3.1.3 Running Backs
A scatterplot of Running Backs’ fantasy points by age is in Figure 11.3.
Code
plot_scatterplotFantasyPointsByAgeRB <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "RB") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_point(
aes(
text = player_display_name, # add player name for mouse over tooltip
label = season # add season for mouse over tooltip
)) +
geom_smooth(
mapping = aes(
x = age,
y = fantasy_points),
inherit.aes = FALSE
) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Running Backs"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_scatterplotFantasyPointsByAgeRB,
tooltip = c("age","fantasy_points","text","label"))Based on the scatterplot, Running Backs’ fantasy points appear to be relatively stable across ages.
11.3.1.4 Wide Receivers
A scatterplot of Wide Receivers’ fantasy points by age is in Figure 11.4.
Code
plot_scatterplotFantasyPointsByAgeWR <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "WR") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_point(
aes(
text = player_display_name, # add player name for mouse over tooltip
label = season # add season for mouse over tooltip
)) +
geom_smooth(
mapping = aes(
x = age,
y = fantasy_points),
inherit.aes = FALSE
) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Wide Receivers"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_scatterplotFantasyPointsByAgeWR,
tooltip = c("age","fantasy_points","text","label"))Based on the scatterplot, Wide Receivers’ fantasy points appear to be relatively stable across ages.
11.3.1.5 Tight Ends
A scatterplot of Tight Ends’ fantasy points by age is in Figure 11.5.
Code
plot_scatterplotFantasyPointsByAgeTE <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "TE") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_point(
aes(
text = player_display_name, # add player name for mouse over tooltip
label = season # add season for mouse over tooltip
)) +
geom_smooth(
mapping = aes(
x = age,
y = fantasy_points),
inherit.aes = FALSE
) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Tight Ends"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_scatterplotFantasyPointsByAgeTE,
tooltip = c("age","fantasy_points","text","label"))Based on the scatterplot (and the bivariate association below), Tight Ends’ fantasy points appear to increase with age.
Code
Pearson's product-moment correlation
data: age and fantasy_points
t = 5.3884, df = 1341, p-value = 8.388e-08
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.09280906 0.19753022
sample estimates:
cor
0.1455774
11.3.2 Plots of Raw Trajectories of Fantasy Points By Age and Player
Scatterplots can be helpful for quickly visualizing the association between two variables. However, as mentioned earlier, scatterplots can hide the association between variables at different units of analysis. For instance, consider that we are trying to predict how a player will perform based on their age. We are interested not only in what the association is between age and fantasy points between players (i.e., a between-person association). We are also interested in what the association is between age and fantasy points within a given player (and within each player; i.e., a within-individual association). Arguably, the within-individual association between age and fantasy points is more relevant to the prediction of performance than the association between age and fantasy points between players. Assuming that the between-player association between age and fantasy points is the same as the within-player association when it is not is an example of the ecological fallacy.
Below, we depict players’ raw trajectories of fantasy points as a function of age. These are known as spaghetti plots. By examining the trajectory for each player, we can get a better understanding of hor performance changes (within an individual) as a function of age.
11.3.2.1 Quarterbacks
A plot of Quarterbacks’ raw fantasy points data by age is in Figure 11.6.
Code
plot_rawFantasyPointsByAgeQB <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "QB") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_line(
aes(
text = player_display_name # add player name for mouse over tooltip
)) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Quarterbacks"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_rawFantasyPointsByAgeQB,
tooltip = c("age","fantasy_points","text"))11.3.2.2 Fullbacks
A plot of Fullbacks’ raw fantasy points data by age is in Figure 11.7.
Code
plot_rawFantasyPointsByAgeFB <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "FB") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_line(
aes(
text = player_display_name # add player name for mouse over tooltip
)) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Fullbacks"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_rawFantasyPointsByAgeFB,
tooltip = c("age","fantasy_points","text"))11.3.2.3 Running Backs
A plot of Running Backs’ raw fantasy points data by age is in Figure 11.8.
Code
plot_rawFantasyPointsByAgeRB <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "RB") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_line(
aes(
text = player_display_name # add player name for mouse over tooltip
)) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Running Backs"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_rawFantasyPointsByAgeRB,
tooltip = c("age","fantasy_points","text"))11.3.2.4 Wide Receivers
A plot of Wide Receivers’ raw fantasy points data by age is in Figure 11.9.
Code
plot_rawFantasyPointsByAgeWR <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "WR") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_line(
aes(
text = player_display_name # add player name for mouse over tooltip
)) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Wide Receivers"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_rawFantasyPointsByAgeWR,
tooltip = c("age","fantasy_points","text"))11.3.2.5 Tight Ends
A plot of Tight Ends’ raw fantasy points data by age is in Figure 11.10.
Code
plot_rawFantasyPointsByAgeTE <- ggplot(
data = player_stats_seasonal_offense_subset %>%
filter(position == "TE") %>%
mutate(
age = round(age, 2),
fantasy_points = round(fantasy_points, 2)
),
mapping = aes(
x = age,
y = fantasy_points,
color = player_id)) +
geom_line(
aes(
text = player_display_name # add player name for mouse over tooltip
)) +
scale_color_viridis(discrete = TRUE) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Tight Ends"
) +
theme_classic() +
theme(legend.position = "none")
ggplotly(
plot_rawFantasyPointsByAgeTE,
tooltip = c("age","fantasy_points","text"))11.3.3 Mixed Models
By accounting for which player each observation comes from using mixed models, we can examine the association between age and fantasy points in a more meaningful way, without violating the assumption in multiple regression that the observations are independent (i.e., that the residuals are uncorrelated).
11.3.3.1 Null Model
Code
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
method [lmerModLmerTest]
Formula: fantasy_points ~ 1 + (1 | player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
148044.0 148066.6 -74019.0 148038.0 13529
Scaled residuals:
Min 1Q Median 3Q Max
-6.5321 -0.4328 -0.1682 0.3538 5.5810
Random effects:
Groups Name Variance Std.Dev.
player_idFactor (Intercept) 2197 46.87
Residual 2335 48.32
Number of obs: 13532, groups: player_idFactor, 3358
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 44.6942 0.9595 3834.9436 46.58 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R2m R2c
[1,] 0 0.4847453
[1] 148044
11.3.3.2 Model with Position as Fixed-Effect Predictor
Code
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
method [lmerModLmerTest]
Formula: fantasy_points ~ positionFactor + (1 | player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
147766.3 147818.9 -73876.2 147752.3 13525
Scaled residuals:
Min 1Q Median 3Q Max
-6.5274 -0.4477 -0.1408 0.3593 5.5407
Random effects:
Groups Name Variance Std.Dev.
player_idFactor (Intercept) 1940 44.04
Residual 2336 48.33
Number of obs: 13532, groups: player_idFactor, 3358
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 15.536 4.447 3449.114 3.494 0.000482 ***
positionFactorQB 61.285 5.178 3470.831 11.836 < 2e-16 ***
positionFactorRB 37.890 4.787 3499.181 7.915 3.28e-15 ***
positionFactorTE 9.873 4.903 3499.455 2.014 0.044136 *
positionFactorWR 27.271 4.693 3490.629 5.811 6.75e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) pstFQB pstFRB pstFTE
postnFctrQB -0.859
postnFctrRB -0.929 0.798
postnFctrTE -0.907 0.779 0.842
postnFctrWR -0.948 0.814 0.880 0.859
R2m R2c
[1,] 0.06139709 0.487171
positionFactor emmean SE df lower.CL upper.CL
FB 15.5 4.45 2896 6.81 24.3
QB 76.8 2.65 2970 71.62 82.0
RB 53.4 1.77 3241 49.95 56.9
TE 25.4 2.07 3159 21.35 29.5
WR 42.8 1.50 3284 39.86 45.8
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
[1] 147766.3
11.3.3.3 Identify the Best-Fitting Functional Form of Age
11.3.3.3.1 Linear Models
Code
Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
method [lmerModLmerTest]
Formula:
fantasy_points ~ positionFactor + ageCentered20 + (1 | player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
71242.9 71297.1 -35613.5 71226.9 6437
Scaled residuals:
Min 1Q Median 3Q Max
-6.3406 -0.4440 -0.1160 0.3999 4.5152
Random effects:
Groups Name Variance Std.Dev.
player_idFactor (Intercept) 2555 50.55
Residual 2682 51.79
Number of obs: 6445, groups: player_idFactor, 1292
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 22.102 8.801 1433.812 2.511 0.012139 *
positionFactorQB 89.929 9.765 1335.477 9.209 < 2e-16 ***
positionFactorRB 47.609 9.254 1353.881 5.145 3.07e-07 ***
positionFactorTE 16.785 9.348 1352.423 1.796 0.072779 .
positionFactorWR 34.404 9.045 1355.844 3.804 0.000149 ***
ageCentered20 -1.495 0.249 5966.317 -6.005 2.02e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) pstFQB pstFRB pstFTE pstFWR
postnFctrQB -0.869
postnFctrRB -0.926 0.829
postnFctrTE -0.913 0.821 0.867
postnFctrWR -0.947 0.848 0.896 0.887
ageCentrd20 -0.180 -0.020 0.033 0.012 0.027
R2m R2c
[1,] 0.09922355 0.5386768
positionFactor emmean SE df lower.CL upper.CL
FB 12.5 8.67 1226 -4.49 29.6
QB 102.5 4.53 1173 93.58 111.3
RB 60.1 3.28 1266 53.72 66.6
TE 29.3 3.53 1253 22.39 36.2
WR 46.9 2.63 1310 41.79 52.1
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
ageCentered20 emmean SE df lower.CL upper.CL
6.4 50.3 2.24 1225 45.9 54.7
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
[1] 71242.93
Code
pointsPerSeason_positionAgeRandomLinearSlopes <- lmerTest::lmer(
fantasy_points ~ positionFactor + ageCentered20 + (1 + ageCentered20 | player_idFactor),
data = player_stats_seasonal_offense_subset,
REML = FALSE,
control = lmerControl(optimizer = "bobyqa")
)
summary(pointsPerSeason_positionAgeRandomLinearSlopes)Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
method [lmerModLmerTest]
Formula:
fantasy_points ~ positionFactor + ageCentered20 + (1 + ageCentered20 |
player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
71019.5 71087.2 -35499.8 70999.5 6435
Scaled residuals:
Min 1Q Median 3Q Max
-6.4021 -0.4324 -0.1166 0.3901 4.7118
Random effects:
Groups Name Variance Std.Dev. Corr
player_idFactor (Intercept) 3483.32 59.020
ageCentered20 28.04 5.296 -0.52
Residual 2430.63 49.301
Number of obs: 6445, groups: player_idFactor, 1292
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 24.9438 8.8723 1459.3714 2.811 0.004998 **
positionFactorQB 88.9525 9.7308 1297.2928 9.141 < 2e-16 ***
positionFactorRB 47.3942 9.2153 1322.7062 5.143 3.11e-07 ***
positionFactorTE 16.3712 9.3043 1314.3615 1.760 0.078720 .
positionFactorWR 34.2219 9.0054 1318.7078 3.800 0.000151 ***
ageCentered20 -2.0167 0.3462 716.2648 -5.826 8.59e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) pstFQB pstFRB pstFTE pstFWR
postnFctrQB -0.860
postnFctrRB -0.918 0.828
postnFctrTE -0.904 0.820 0.867
postnFctrWR -0.938 0.848 0.896 0.887
ageCentrd20 -0.237 -0.001 0.039 0.017 0.033
R2m R2c
[1,] 0.09797596 0.5835368
positionFactor emmean SE df lower.CL upper.CL
FB 12.0 8.64 1189 -4.92 29.0
QB 101.0 4.53 1151 92.10 109.9
RB 59.4 3.28 1288 52.99 65.9
TE 28.4 3.52 1242 21.50 35.3
WR 46.3 2.63 1321 41.10 51.4
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
ageCentered20 emmean SE df lower.CL upper.CL
6.4 49.4 2.25 1168 45 53.8
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
[1] 71019.5
11.3.3.3.2 Quadratic Models
Code
pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes <- lmerTest::lmer(
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + (1 + ageCentered20 | player_idFactor),
data = player_stats_seasonal_offense_subset,
REML = FALSE,
control = lmerControl(optimizer = "bobyqa")
)
summary(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes)Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
method [lmerModLmerTest]
Formula:
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +
(1 + ageCentered20 | player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
70665.5 70740.0 -35321.7 70643.5 6434
Scaled residuals:
Min 1Q Median 3Q Max
-6.6477 -0.4430 -0.1063 0.3817 4.7823
Random effects:
Groups Name Variance Std.Dev. Corr
player_idFactor (Intercept) 4050.7 63.645
ageCentered20 61.2 7.823 -0.60
Residual 2157.5 46.449
Number of obs: 6445, groups: player_idFactor, 1292
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) -24.02735 9.34231 1728.76809 -2.572 0.0102 *
positionFactorQB 88.81757 9.85929 1344.37099 9.009 < 2e-16 ***
positionFactorRB 51.31903 9.32501 1359.21221 5.503 4.45e-08 ***
positionFactorTE 16.83493 9.41871 1353.13565 1.787 0.0741 .
positionFactorWR 36.99633 9.11732 1357.91920 4.058 5.24e-05 ***
ageCentered20 14.28911 0.89964 4453.56443 15.883 < 2e-16 ***
ageCentered20Quadratic -1.24457 0.05996 3926.72121 -20.756 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) pstFQB pstFRB pstFTE pstFWR agCn20
postnFctrQB -0.828
postnFctrRB -0.889 0.830
postnFctrTE -0.872 0.822 0.869
postnFctrWR -0.907 0.849 0.899 0.889
ageCentrd20 -0.340 -0.004 0.032 0.011 0.027
agCntrd20Qd 0.259 0.007 -0.017 -0.005 -0.014 -0.893
R2m R2c
[1,] 0.1655593 0.6746525
positionFactor emmean SE df lower.CL upper.CL
FB 3.38 8.77 1236 -13.8 20.6
QB 92.19 4.62 1223 83.1 101.3
RB 54.70 3.33 1334 48.2 61.2
TE 20.21 3.58 1293 13.2 27.2
WR 40.37 2.68 1386 35.1 45.6
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
ageCentered20 emmean SE df lower.CL upper.CL
6.4 42.2 2.33 1234 37.6 46.7
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
ageCentered20Quadratic emmean SE df lower.CL upper.CL
51.5 42.2 2.33 1234 37.6 46.7
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
[1] 70665.47
Code
pointsPerSeason_positionAgeRandomLinearRandomQuadraticSlopes <- lmerTest::lmer(
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + (1 + ageCentered20 + ageCentered20Quadratic | player_idFactor),
data = player_stats_seasonal_offense_subset,
REML = FALSE,
control = lmerControl(optimizer = "bobyqa")
)
pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction <- lmerTest::lmer(
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic + (1 + ageCentered20 | player_idFactor),
data = player_stats_seasonal_offense_subset,
REML = FALSE,
control = lmerControl(optimizer = "bobyqa")
)
summary(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction)Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
method [lmerModLmerTest]
Formula:
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +
positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic +
(1 + ageCentered20 | player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
70621.8 70750.5 -35291.9 70583.8 6426
Scaled residuals:
Min 1Q Median 3Q Max
-6.7812 -0.4499 -0.1064 0.3837 4.7504
Random effects:
Groups Name Variance Std.Dev. Corr
player_idFactor (Intercept) 3874.28 62.24
ageCentered20 56.85 7.54 -0.57
Residual 2136.45 46.22
Number of obs: 6445, groups: player_idFactor, 1292
Fixed effects:
Estimate Std. Error df
(Intercept) -23.02774 26.67799 4427.31960
positionFactorQB 65.80803 28.07816 4194.19857
positionFactorRB 62.12817 27.68236 4329.35668
positionFactorTE 20.05015 27.89862 4326.89022
positionFactorWR 29.59854 27.35775 4370.07994
ageCentered20 11.79540 7.43993 4959.04172
ageCentered20Quadratic -0.87148 0.50915 4523.63824
positionFactorQB:ageCentered20 7.19239 7.65334 4914.64310
positionFactorRB:ageCentered20 1.31528 7.76909 4927.20674
positionFactorTE:ageCentered20 -0.85010 7.73970 4938.17986
positionFactorWR:ageCentered20 5.36603 7.64413 4949.66508
positionFactorQB:ageCentered20Quadratic -0.49079 0.51719 4542.27913
positionFactorRB:ageCentered20Quadratic -0.61350 0.53800 4439.78564
positionFactorTE:ageCentered20Quadratic 0.06118 0.52899 4492.88066
positionFactorWR:ageCentered20Quadratic -0.65597 0.52521 4496.71285
t value Pr(>|t|)
(Intercept) -0.863 0.3881
positionFactorQB 2.344 0.0191 *
positionFactorRB 2.244 0.0249 *
positionFactorTE 0.719 0.4724
positionFactorWR 1.082 0.2794
ageCentered20 1.585 0.1129
ageCentered20Quadratic -1.712 0.0870 .
positionFactorQB:ageCentered20 0.940 0.3474
positionFactorRB:ageCentered20 0.169 0.8656
positionFactorTE:ageCentered20 -0.110 0.9125
positionFactorWR:ageCentered20 0.702 0.4827
positionFactorQB:ageCentered20Quadratic -0.949 0.3427
positionFactorRB:ageCentered20Quadratic -1.140 0.2542
positionFactorTE:ageCentered20Quadratic 0.116 0.9079
positionFactorWR:ageCentered20Quadratic -1.249 0.2117
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R2m R2c
[1,] 0.1582759 0.6751097
Code
positionFactor emmean SE df lower.CL upper.CL
FB 7.62 9.43 1344 -10.9 26.1
QB 94.20 4.73 1111 84.9 103.5
RB 46.59 3.71 1375 39.3 53.9
TE 25.37 3.78 1247 18.0 32.8
WR 37.80 2.89 1333 32.1 43.5
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
ageCentered20 emmean SE df lower.CL upper.CL
6.4 42.3 2.43 1301 37.5 47.1
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
ageCentered20Quadratic emmean SE df lower.CL upper.CL
51.5 42.2 2.33 1234 37.6 46.7
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
[1] 70621.85
Code
pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience <- lmerTest::lmer(
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic + years_of_experience + (1 + ageCentered20 | player_idFactor),
data = player_stats_seasonal_offense_subset,
REML = FALSE,
control = lmerControl(optimizer = "bobyqa")
)
summary(pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
method [lmerModLmerTest]
Formula:
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +
positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic +
years_of_experience + (1 + ageCentered20 | player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
70592.4 70727.8 -35276.2 70552.4 6425
Scaled residuals:
Min 1Q Median 3Q Max
-6.7526 -0.4415 -0.1012 0.3834 4.7239
Random effects:
Groups Name Variance Std.Dev. Corr
player_idFactor (Intercept) 3856.96 62.104
ageCentered20 55.69 7.463 -0.58
Residual 2140.80 46.269
Number of obs: 6445, groups: player_idFactor, 1292
Fixed effects:
Estimate Std. Error df
(Intercept) -14.41747 26.71450 4443.49841
positionFactorQB 61.85599 28.07863 4206.42470
positionFactorRB 59.26821 27.68023 4341.88102
positionFactorTE 18.66483 27.89324 4337.42684
positionFactorWR 27.05937 27.35504 4381.59101
ageCentered20 6.92672 7.48505 5052.03481
ageCentered20Quadratic -0.89662 0.50892 4506.71231
years_of_experience 5.98527 1.05344 2059.25177
positionFactorQB:ageCentered20 7.15606 7.64964 4911.53510
positionFactorRB:ageCentered20 1.41150 7.76561 4923.15762
positionFactorTE:ageCentered20 -0.95441 7.73630 4933.84264
positionFactorWR:ageCentered20 5.38514 7.64072 4945.56253
positionFactorQB:ageCentered20Quadratic -0.47891 0.51694 4525.86739
positionFactorRB:ageCentered20Quadratic -0.62677 0.53771 4423.94494
positionFactorTE:ageCentered20Quadratic 0.06362 0.52872 4475.97907
positionFactorWR:ageCentered20Quadratic -0.65946 0.52493 4480.80773
t value Pr(>|t|)
(Intercept) -0.540 0.5894
positionFactorQB 2.203 0.0277 *
positionFactorRB 2.141 0.0323 *
positionFactorTE 0.669 0.5034
positionFactorWR 0.989 0.3226
ageCentered20 0.925 0.3548
ageCentered20Quadratic -1.762 0.0782 .
years_of_experience 5.682 1.52e-08 ***
positionFactorQB:ageCentered20 0.935 0.3496
positionFactorRB:ageCentered20 0.182 0.8558
positionFactorTE:ageCentered20 -0.123 0.9018
positionFactorWR:ageCentered20 0.705 0.4810
positionFactorQB:ageCentered20Quadratic -0.926 0.3543
positionFactorRB:ageCentered20Quadratic -1.166 0.2438
positionFactorTE:ageCentered20Quadratic 0.120 0.9042
positionFactorWR:ageCentered20Quadratic -1.256 0.2091
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
R2m R2c
[1,] 0.1650522 0.6694687
Code
positionFactor emmean SE df lower.CL upper.CL
FB 11.3 9.30 1351 -6.97 29.5
QB 94.3 4.65 1109 85.17 103.4
RB 47.3 3.65 1374 40.17 54.5
TE 27.1 3.73 1250 19.80 34.4
WR 38.9 2.85 1331 33.28 44.5
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
ageCentered20 emmean SE df lower.CL upper.CL
6.4 43.8 2.4 1308 39.1 48.5
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
ageCentered20Quadratic emmean SE df lower.CL upper.CL
51.5 43.8 2.4 1308 39.1 48.5
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
years_of_experience emmean SE df lower.CL upper.CL
4.6 43.8 2.4 1308 39.1 48.5
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
[1] 70592.39
11.3.3.3.3 Compare Models
Code
Code
mixedModels <- list(
"null" = pointsPerSeason_nullModel,
"position" = pointsPerSeason_position,
"fixedLinear" = pointsPerSeason_positionAgeFixedLinearSlopes,
"randomLinear" = pointsPerSeason_positionAgeRandomLinearSlopes,
"randomLinearFixedQuadratic" = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopes,
"randomLinearFixedQuadraticInteraction" = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteraction
)
aictab(mixedModels)11.3.3.3.4 Generalized Additive Model
[1] 4
Code
pointsPerSeason_gam <- bam( # using bam() instead of gam() for faster estimation due to large size of data
fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) + years_of_experience + s(player_idFactor, ageCentered20, bs = "re"),
data = player_stats_seasonal_offense_subset,
nthreads = num_cores
)
pointsPerSeason_gamSummary <- summary(pointsPerSeason_gam)
pointsPerSeason_gamSummary
Family: gaussian
Link function: identity
Formula:
fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) +
years_of_experience + s(player_idFactor, ageCentered20, bs = "re")
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -7.1108 10.2247 -0.695 0.48680
positionFactorQB 87.7506 10.7149 8.190 3.23e-16 ***
positionFactorRB 28.7960 10.2812 2.801 0.00511 **
positionFactorTE 12.6026 10.2924 1.224 0.22083
positionFactorWR 22.6299 9.9928 2.265 0.02357 *
years_of_experience 4.2443 0.9271 4.578 4.80e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(ageCentered20):positionFactorFB 1.761 2.221 2.488 0.0735 .
s(ageCentered20):positionFactorQB 5.143 6.310 34.597 <2e-16 ***
s(ageCentered20):positionFactorRB 4.892 5.905 35.700 <2e-16 ***
s(ageCentered20):positionFactorTE 4.142 5.146 14.293 <2e-16 ***
s(ageCentered20):positionFactorWR 5.288 6.300 39.727 <2e-16 ***
s(player_idFactor,ageCentered20) 880.138 1287.000 5.070 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.583 Deviance explained = 64.1%
fREML = 35673 Scale est. = 2784.9 n = 6445
[1] 0.5828168
R2m R2c
[1,] 0.5574201 0.5574201
[1] 70254.43
11.3.3.3.5 Players Who Were (at Least Once) at the Top of the End-of-Season Depth Chart
Code
pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience <- lmerTest::lmer(
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic + positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic + years_of_experience + (1 + ageCentered20 | player_idFactor),
data = player_stats_seasonal_offense_subset,
control = lmerControl(optimizer = "bobyqa")
)
summary(pointsPerSeasonDepth_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience)Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula:
fantasy_points ~ positionFactor + ageCentered20 + ageCentered20Quadratic +
positionFactor:ageCentered20 + positionFactor:ageCentered20Quadratic +
years_of_experience + (1 + ageCentered20 | player_idFactor)
Data: player_stats_seasonal_offense_subset
Control: lmerControl(optimizer = "bobyqa")
REML criterion at convergence: 70526.5
Scaled residuals:
Min 1Q Median 3Q Max
-6.7527 -0.4422 -0.1011 0.3827 4.7192
Random effects:
Groups Name Variance Std.Dev. Corr
player_idFactor (Intercept) 3892.71 62.392
ageCentered20 56.65 7.527 -0.58
Residual 2142.60 46.288
Number of obs: 6445, groups: player_idFactor, 1292
Fixed effects:
Estimate Std. Error df
(Intercept) -14.50932 26.75909 4424.09802
positionFactorQB 61.83156 28.12724 4186.26252
positionFactorRB 59.23635 27.72713 4322.39098
positionFactorTE 18.69491 27.94051 4317.87428
positionFactorWR 27.03245 27.40111 4362.14111
ageCentered20 6.96089 7.49658 5041.41649
ageCentered20Quadratic -0.89892 0.50977 4503.24842
years_of_experience 5.97746 1.05605 2050.13256
positionFactorQB:ageCentered20 7.16792 7.66153 4900.34818
positionFactorRB:ageCentered20 1.43053 7.77761 4912.87816
positionFactorTE:ageCentered20 -0.96261 7.74822 4923.54745
positionFactorWR:ageCentered20 5.40171 7.65247 4935.17920
positionFactorQB:ageCentered20Quadratic -0.48061 0.51780 4522.63658
positionFactorRB:ageCentered20Quadratic -0.62953 0.53863 4421.51275
positionFactorTE:ageCentered20Quadratic 0.06391 0.52961 4473.57721
positionFactorWR:ageCentered20Quadratic -0.66176 0.52582 4477.90547
t value Pr(>|t|)
(Intercept) -0.542 0.5877
positionFactorQB 2.198 0.0280 *
positionFactorRB 2.136 0.0327 *
positionFactorTE 0.669 0.5035
positionFactorWR 0.987 0.3239
ageCentered20 0.929 0.3532
ageCentered20Quadratic -1.763 0.0779 .
years_of_experience 5.660 1.72e-08 ***
positionFactorQB:ageCentered20 0.936 0.3495
positionFactorRB:ageCentered20 0.184 0.8541
positionFactorTE:ageCentered20 -0.124 0.9011
positionFactorWR:ageCentered20 0.706 0.4803
positionFactorQB:ageCentered20Quadratic -0.928 0.3534
positionFactorRB:ageCentered20Quadratic -1.169 0.2426
positionFactorTE:ageCentered20Quadratic 0.121 0.9040
positionFactorWR:ageCentered20Quadratic -1.259 0.2083
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
R2m R2c
[1,] 0.1648185 0.6708484
Code
positionFactor emmean SE df lower.CL upper.CL
FB 11.3 9.30 1333 -6.99 29.5
QB 94.2 4.65 1094 85.11 103.4
RB 47.3 3.65 1356 40.10 54.4
TE 27.1 3.73 1234 19.77 34.4
WR 38.8 2.85 1313 33.22 44.4
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
ageCentered20 emmean SE df lower.CL upper.CL
6.4 43.7 2.4 1290 39 48.4
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
ageCentered20Quadratic emmean SE df lower.CL upper.CL
51.5 43.7 2.4 1290 39 48.4
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
years_of_experience emmean SE df lower.CL upper.CL
4.6 43.7 2.4 1290 39 48.4
Results are averaged over the levels of: positionFactor
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
Code
[1] 70566.51
Code
pointsPerSeasonDepth_gam <- bam( # using bam() instead of gam() for faster estimation due to large size of data
fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) + years_of_experience + s(player_idFactor, ageCentered20, bs = "re"),
data = player_stats_seasonal_offense_subsetDepth,
nthreads = num_cores
)
pointsPerSeasonDepth_gamSummary <- summary(pointsPerSeasonDepth_gam)
pointsPerSeasonDepth_gamSummary
Family: gaussian
Link function: identity
Formula:
fantasy_points ~ positionFactor + s(ageCentered20, by = positionFactor) +
years_of_experience + s(player_idFactor, ageCentered20, bs = "re")
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.464 11.895 0.543 0.586866
positionFactorQB 112.659 12.225 9.215 < 2e-16 ***
positionFactorRB 52.338 11.810 4.431 9.58e-06 ***
positionFactorTE 27.058 11.954 2.264 0.023648 *
positionFactorWR 41.742 11.430 3.652 0.000263 ***
years_of_experience 1.072 1.181 0.908 0.363798
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(ageCentered20):positionFactorFB 1.554 1.931 0.635 0.461
s(ageCentered20):positionFactorQB 4.974 6.127 19.094 < 2e-16 ***
s(ageCentered20):positionFactorRB 4.149 5.127 18.527 < 2e-16 ***
s(ageCentered20):positionFactorTE 3.761 4.711 7.432 2.17e-06 ***
s(ageCentered20):positionFactorWR 4.783 5.802 22.841 < 2e-16 ***
s(player_idFactor,ageCentered20) 600.644 780.000 6.092 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.564 Deviance explained = 61.7%
fREML = 28625 Scale est. = 3193.5 n = 5121
[1] 0.564187
R2m R2c
[1,] 0.5405308 0.5405308
[1] 56444.14
11.3.4 Plots of Model-Implied Fantasy Points by Position and Age
Code
# Create newdata object
pointsPerSeason_positionAge_newData <- expand.grid(
positionFactor = factor(c("FB","QB","RB","TE","WR")), #,"K"
age = seq(from = 20, to = 40, length.out = 10000)
)
pointsPerSeason_positionAge_newData$ageCentered20 <- pointsPerSeason_positionAge_newData$age - 20
pointsPerSeason_positionAge_newData$ageCentered20Quadratic <- pointsPerSeason_positionAge_newData$ageCentered20 ^ 2
pointsPerSeason_positionAge_newData$years_of_experience <- floor(pointsPerSeason_positionAge_newData$age - 22) # assuming that most players start at age 22 (i.e., rookie year) and thus have 1 year of experience at age 23
pointsPerSeason_positionAge_newData$years_of_experience[which(pointsPerSeason_positionAge_newData$years_of_experience < 0)] <- 0
# From Quadratic Model: All Players
pointsPerSeason_positionAge_newData$fantasyPoints_quadratic <- predict(
object = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience,
newdata = pointsPerSeason_positionAge_newData,
re.form = NA
)
# From Quadratic Model: Players at Top of End-of-Season Depth Chart
pointsPerSeason_positionAge_newData$fantasyPoints_depthQuadratic <- predict(
object = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience,
newdata = pointsPerSeason_positionAge_newData,
re.form = NA
)
# From GAM Model: All Players
pointsPerSeason_positionAge_newData$fantasyPoints_gam <- predict(
object = pointsPerSeason_gam,
newdata = pointsPerSeason_positionAge_newData,
newdata.guaranteed = TRUE,
exclude = "s(player_idFactor,ageCentered20)"
)
# From GAM Model: Players at Top of End-of-Season Depth Chart
pointsPerSeason_positionAge_newData$fantasyPoints_depthGAM <- predict(
object = pointsPerSeasonDepth_gam,
newdata = pointsPerSeason_positionAge_newData,
newdata.guaranteed = TRUE,
exclude = "s(player_idFactor,ageCentered20)"
)Plots of model-implied fantasy points by position and age are in Figures 11.11–11.14.
11.3.4.1 Quadratic Model
Code
ggplot2::ggplot(
data = pointsPerSeason_positionAge_newData,
mapping = aes(
x = age,
y = fantasyPoints_quadratic,
color = positionFactor
)
) +
geom_smooth() +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age and Position",
subtitle = "Quadratic Model with All Players"
) +
theme_classic()
11.3.4.2 Quadratic Model: Top of Depth Chart
Code
ggplot2::ggplot(
data = pointsPerSeason_positionAge_newData,
mapping = aes(
x = age,
y = fantasyPoints_depthQuadratic,
color = positionFactor
)
) +
geom_smooth() +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age and Position",
subtitle = "Quadratic Model with Players Who Were Once at Top of End-of-Season Depth Chart"
) +
theme_classic()
11.3.4.3 Generalized Additive Model
Code
ggplot2::ggplot(
data = pointsPerSeason_positionAge_newData,
mapping = aes(
x = age,
y = fantasyPoints_gam,
color = positionFactor
)
) +
geom_smooth() +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age and Position",
subtitle = "Generalized Additive Model with All Players"
) +
theme_classic()
11.3.4.4 Generalized Additive Model: Top of Depth Chart
Code
ggplot2::ggplot(
data = pointsPerSeason_positionAge_newData,
mapping = aes(
x = age,
y = fantasyPoints_depthGAM,
color = positionFactor
)
) +
geom_smooth() +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age and Position",
subtitle = "Generalized Additive Model with Players Who Were Once at Top of End-of-Season Depth Chart"
) +
theme_classic()
11.3.5 Plots of Individuals’ Model-Implied Fantasy Points by Age
Code
player_stats_seasonal_offense_subsetCC <- player_stats_seasonal_offense_subset %>%
filter(!is.na(player_idFactor), !is.na(fantasy_points), !is.na(positionFactor), !is.na(ageCentered20), !is.na(ageCentered20Quadratic), !is.na(years_of_experience))
player_stats_seasonal_offense_subsetCC$fantasyPoints_quadratic <- predict(
object = pointsPerSeason_positionAgeRandomLinearFixedQuadraticSlopesInteractionExperience,
newdata = player_stats_seasonal_offense_subsetCC
)
player_stats_seasonal_offense_subsetCC$fantasyPoints_gam <- predict(
object = pointsPerSeason_gam,
newdata = player_stats_seasonal_offense_subsetCC
)Code
zeroAge <- pointsPerSeason_positionAge_newData %>%
group_by(positionFactor) %>%
filter(fantasyPoints_gam < 0) %>%
slice(which.min(age))
peakAge <- pointsPerSeason_positionAge_newData %>%
group_by(positionFactor) %>%
slice(which.max(fantasyPoints_gam))
peakAge2 <- pointsPerSeason_positionAge_newData %>%
filter(age > 22) %>%
group_by(positionFactor) %>%
slice(which.max(fantasyPoints_gam))
qbPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "QB")], 0)
fbPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "FB")], 0)
rbPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "RB")], 0)
wrPeakAge <- round(peakAge$age[which(peakAge$positionFactor == "WR")], 0)
wrPeakAge2 <- round(peakAge2$age[which(peakAge$positionFactor == "WR")], 0)
tePeakAge <- round(peakAge$age[which(peakAge$positionFactor == "TE")], 0)
qbZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "QB")], 0)
fbZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "FB")], 0)
rbZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "RB")], 0)
wrZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "WR")], 0)
teZeroAge <- round(zeroAge$age[which(zeroAge$positionFactor == "TE")], 0)11.3.5.1 Quarterbacks
A plot of Quarterbacks’ model-implied fantasy points by age is in Figure 11.15. The model-implied peak of Quarterbacks’ fantasy points is at age 20. The model-predicted value of zero fantasy points for Quarterbacks is at 36.
Code
plot_individualFantasyPointsByAgeQB <- ggplot(
data = player_stats_seasonal_offense_subsetCC %>% filter(position == "QB"),
mapping = aes(
x = age,
y = fantasyPoints_gam,
group = player_id)) +
geom_smooth(
aes(
text = player_display_name # add player name for mouse over tooltip
),
se = FALSE,
linewidth = 0.5,
color = "black") +
geom_smooth(
mapping = aes(
x = age,
y = fantasyPoints_gam
),
data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "QB"),
inherit.aes = FALSE,
se = TRUE,
linewidth = 2
) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Quarterbacks"
) +
theme_classic()
ggplotly(
plot_individualFantasyPointsByAgeQB,
tooltip = c("age","fantasyPoints_gam","text"))11.3.5.2 Fullbacks
A plot of Fullbacks’ model-implied fantasy points by age is in Figure 11.16. The model-implied peak of Fullbacks’ fantasy points is at age 26. The model-predicted value of zero fantasy points for Fullbacks is at 32.
Code
plot_individualFantasyPointsByAgeFB <- ggplot(
data = player_stats_seasonal_offense_subsetCC %>% filter(position == "FB"),
mapping = aes(
x = age,
y = fantasyPoints_gam,
group = player_id)) +
geom_smooth(
aes(
text = player_display_name # add player name for mouse over tooltip
),
se = FALSE,
linewidth = 0.5,
color = "black") +
geom_smooth(
mapping = aes(
x = age,
y = fantasyPoints_gam
),
data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "FB"),
inherit.aes = FALSE,
se = TRUE,
linewidth = 2
) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Fullbacks"
) +
theme_classic()
ggplotly(
plot_individualFantasyPointsByAgeFB,
tooltip = c("age","fantasyPoints_gam","text"))11.3.5.3 Running Backs
A plot of Running Backs’ model-implied fantasy points by age is in Figure 11.17. The model-implied peak of Running Backs’ fantasy points is at age 20. The model-predicted value of zero fantasy points for Running Backs is at 30.
Code
plot_individualFantasyPointsByAgeRB <- ggplot(
data = player_stats_seasonal_offense_subsetCC %>% filter(position == "RB"),
mapping = aes(
x = age,
y = fantasyPoints_gam,
group = player_id)) +
geom_smooth(
aes(
text = player_display_name # add player name for mouse over tooltip
),
se = FALSE,
linewidth = 0.5,
color = "black") +
geom_smooth(
mapping = aes(
x = age,
y = fantasyPoints_gam
),
data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "RB"),
inherit.aes = FALSE,
se = TRUE,
linewidth = 2
) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Running Backs"
) +
theme_classic()
ggplotly(
plot_individualFantasyPointsByAgeRB,
tooltip = c("age","fantasyPoints_gam","text"))11.3.5.4 Wide Receivers
A plot of Wide Receivers’ model-implied fantasy points by age is in Figure 11.18. The model-implied peaks of Wide Receivers’ fantasy points are at ages 20 and 26. The model-predicted value of zero fantasy points for Wide Receivers is at 30.
Code
plot_individualFantasyPointsByAgeWR <- ggplot(
data = player_stats_seasonal_offense_subsetCC %>% filter(position == "WR"),
mapping = aes(
x = age,
y = fantasyPoints_gam,
group = player_id)) +
geom_smooth(
aes(
text = player_display_name # add player name for mouse over tooltip
),
se = FALSE,
linewidth = 0.5,
color = "black") +
geom_smooth(
mapping = aes(
x = age,
y = fantasyPoints_gam
),
data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "WR"),
inherit.aes = FALSE,
se = TRUE,
linewidth = 2
) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Wide Receivers"
) +
theme_classic()
ggplotly(
plot_individualFantasyPointsByAgeWR,
tooltip = c("age","fantasyPoints_gam","text"))11.3.5.5 Tight Ends
A plot of Tight Ends’ model-implied fantasy points by age is in Figure 11.19. The model-implied peak of Tight Ends’ fantasy points is at age 26. The model-predicted value of zero fantasy points for Tight Ends is at 32.
Code
plot_individualFantasyPointsByAgeTE <- ggplot(
data = player_stats_seasonal_offense_subsetCC %>% filter(position == "TE"),
mapping = aes(
x = age,
y = fantasyPoints_gam,
group = player_id)) +
geom_smooth(
aes(
text = player_display_name # add player name for mouse over tooltip
),
se = FALSE,
linewidth = 0.5,
color = "black") +
geom_smooth(
mapping = aes(
x = age,
y = fantasyPoints_gam
),
data = pointsPerSeason_positionAge_newData %>% filter(positionFactor == "TE"),
inherit.aes = FALSE,
se = TRUE,
linewidth = 2
) +
labs(
x = "Player Age (years)",
y = "Fantasy Points (Season)",
title = "Fantasy Points (Season) by Player Age: Tight Ends"
) +
theme_classic()
ggplotly(
plot_individualFantasyPointsByAgeTE,
tooltip = c("age","fantasyPoints_gam","text"))11.4 Conclusion
Mixed models allow accounting for multiple levels or units of analysis and to include both fixed and random effects. Inclusion of random effects allows effects of a predictor to vary as a function of individuals in the grouping level. This allows for more accurately predicting phenomena. We applied mixed models with random intercepts and random slopes to allow our model estimates to account for the fact that different players have different starting points (intercepts) and different changes over time (slopes) in fantasy points. A quadratic, inverted-U-shaped form as a function of age fit better than a linear form as a function of age in predicting players’ fantasy points. A generalized additive model that allows further nonlinearity fit even better than the quadratic model.
Based on the bivariate scatterplots between age and fantasy points, we might conclude that players tend to stay stable or even increase in fantasy points with age. However, we would be wrong. When we account for the longitudinal data (i.e., multiple observations over time for the same player) using mixed models, we observe that fantasy points tend to decrease with age, with the timing and rate of decline differing for each position. In other words, the association between age and fantasy points differs at the person level versus the group level. This is an example of Simpson’s paradox. The discrepancy between the positive or null association between age and fantasy points at the group level, and the negative association at the person level may be due, in part, to the selective attrition of players with age. The highest performing players will tend to play the longest, whereas the poorest performing players will retire or get dropped from the team at the youngest ages. Thus, the selective attrition may make it seem that there is no association between age and performance (or even a positive one!), when in fact, players’ performance tends to decrease with age after age 26 or so (with the timing differing from position to position).
In sum, mixed models are valuable for examining associations between variables when there are multiple levels (i.e., clustering or nesting). It is important not to confuse the association at one level (e.g., group level) with the association at another level (e.g., person level), which is an example of the ecological fallacy.
11.5 Session Info
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
time zone: UTC
tzcode source: system (glibc)
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
[5] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[9] tidyverse_2.0.0 viridis_0.6.5 viridisLite_0.4.2 plotly_4.10.4
[13] ggplot2_3.5.1 AICcmodavg_2.3-3 mgcv_1.9-1 nlme_3.1-164
[17] sjstats_0.19.0 emmeans_1.10.3 MuMIn_1.48.4 lmerTest_3.1-3
[21] lme4_1.1-35.5 Matrix_1.7-0 petersenlab_1.0.3
loaded via a namespace (and not attached):
[1] DBI_1.2.3 mnormt_2.1.1 gridExtra_2.3
[4] rlang_1.1.4 magrittr_2.0.3 compiler_4.4.1
[7] vctrs_0.6.5 reshape2_1.4.4 quadprog_1.5-8
[10] pkgconfig_2.0.3 fastmap_1.2.0 backports_1.5.0
[13] labeling_0.4.3 pbivnorm_0.6.0 utf8_1.2.4
[16] rmarkdown_2.27 tzdb_0.4.0 nloptr_2.1.1
[19] xfun_0.46 jsonlite_1.8.8 VGAM_1.1-11
[22] psych_2.4.6.26 broom_1.0.6 lavaan_0.6-18
[25] cluster_2.1.6 R6_2.5.1 stringi_1.8.4
[28] RColorBrewer_1.1-3 boot_1.3-30 rpart_4.1.23
[31] numDeriv_2016.8-1.1 estimability_1.5.1 Rcpp_1.0.13
[34] knitr_1.48 base64enc_0.1-3 splines_4.4.1
[37] nnet_7.3-19 timechange_0.3.0 tidyselect_1.2.1
[40] rstudioapi_0.16.0 yaml_2.3.10 lattice_0.22-6
[43] plyr_1.8.9 withr_3.0.0 coda_0.19-4.1
[46] evaluate_0.24.0 foreign_0.8-86 survival_3.6-4
[49] pillar_1.9.0 checkmate_2.3.1 stats4_4.4.1
[52] insight_0.20.2 generics_0.1.3 mix_1.0-12
[55] hms_1.1.3 munsell_0.5.1 scales_1.3.0
[58] minqa_1.2.7 xtable_1.8-4 glue_1.7.0
[61] unmarked_1.4.1 Hmisc_5.1-3 lazyeval_0.2.2
[64] tools_4.4.1 data.table_1.15.4 mvtnorm_1.2-5
[67] grid_4.4.1 mitools_2.4 crosstalk_1.2.1
[70] datawizard_0.12.2 colorspace_2.1-1 performance_0.12.2
[73] htmlTable_2.4.3 Formula_1.2-5 cli_3.6.3
[76] fansi_1.0.6 gtable_0.3.5 digest_0.6.36
[79] pbkrtest_0.5.3 farver_2.1.2 htmlwidgets_1.6.4
[82] htmltools_0.5.8.1 lifecycle_1.0.4 httr_1.4.7
[85] MASS_7.3-60.2